(54) Speech encoding device and method having TFO (Tandem Free Operation) function



(57) The internal state matching of an encoder when
switching from TFO mode to tandem connection is 
maintained while suppressing the corresponding in- 
crease in the amount of processing. In the TFO mode, 
PCM data and compressed data transmitted in multi- 
plexed form are demultiplexed by a PCM data/com- 
pressed data demultiplexing unit, and the compressed 
data is selected by a selector for output. At the same
time, an encoding functional unit continues to encode
the demultiplexed PCM data so that the internal state 
matching of the encoder can be maintained in case of 
a fallback to the tandem connection. At this time, to al- 
leviate the processing burden of the encoder, part of the 
demultiplexed encoded data, for example, stochastic 
codebook data, is extracted and supplied to the encod- 
ing functional unit. 
Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0001] The present invention relates to a speech
encoding device having a TFO function, and to a speech
encoding method.

2. Description of the Related Art 

[0002] In recent years, speech codecs that compress
speech data for transmission have come to compress
64-kbps speech data in the telephone speech band to
about 4 kbps to 8 kbps for transmission. In particular, in
the field of mobile communications, low bit-rate speech
codecs have come into use for efficient utilization of
bandwidth. In such speech codecs, speech quality
degradation due to the accumulation of distortion
associated with compression and decompression,
especially in the tandem operation of codecs (the
configuration hereinafter called the tandem connection),
has become a greater issue than before.

[0003] It is said that a method called digital one-link
connection, in which data is transmitted end to end in
compressed form as it is, is desirable for use with
speech codecs. However, in mobile-to-mobile
connections, for example, in the second generation mobile
communication systems (such as European GSM,
North American PCS, and Japan's PDC), a serial
operation called a tandem connection, and not a digital
one-link connection, occurs. How this occurs will be
explained with reference to Figure 1. Because a speech
codec intervenes in order to connect a mobile unit 12 to a
public network 10 in a mobile switching center (MSC) 14,
the compressed data is first converted to 64 kbps PCM
code even when the destination of the connection is a
mobile unit 16. This results in a tandem connection in
which the two speech codecs are connected in series
when connecting one mobile unit to the other, and
causes degradation in speech quality.
[0004] A technique for solving this problem is
disclosed in U.S. Patent No. 5991716 or in 3GPP (3rd
Generation Partnership Project) Technical Specification TS
28.062. This technique is called Tandem Free Operation
(TFO) because the tandem connection of codecs is
removed. An overview of this operation is shown in Figure
2. By bit stealing from G.711 PCM data between TCs
(Transcoders: codecs) 18 and 20 (the data is obtained
by local decoding operations at the TCs), and by
mapping compressed speech data thereon, the compressed
data from the terminal is passed through without the TCs
(codecs) themselves performing re-encoding
(re-compression) operations. This achieves a digital one-link
between the mobile units. Figure 3 shows the format of
the data transmitted between the TCs. In this case, the
six MSBs of the PCM data obtained by local decoding
operations at the TCs are left unchanged, but the two
LSBs are stolen and the compressed speech data bits
are embedded therein.
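
The bit stealing described in paragraph [0004] can be illustrated with a minimal sketch. It assumes plain 8-bit octets and exactly two payload bits per octet; the function names `embed_tfo_bits` and `split_tfo_octets` are hypothetical, and the actual TFO frame structure defined in TS 28.062 is not modelled here.

```python
def embed_tfo_bits(pcm_octets, payload_bits):
    """Overwrite the two LSBs of each 8-bit PCM octet with two payload bits."""
    if len(payload_bits) != 2 * len(pcm_octets):
        raise ValueError("need exactly two payload bits per PCM octet")
    out = []
    for i, octet in enumerate(pcm_octets):
        two_bits = (payload_bits[2 * i] << 1) | payload_bits[2 * i + 1]
        out.append((octet & 0xFC) | two_bits)  # keep six MSBs, replace two LSBs
    return out


def split_tfo_octets(octets):
    """Demultiplex: return (PCM octets with LSBs cleared, recovered payload bits)."""
    pcm = [o & 0xFC for o in octets]
    bits = []
    for o in octets:
        bits.extend([(o >> 1) & 1, o & 1])
    return pcm, bits


muxed = embed_tfo_bits([0xFF, 0x80, 0x55], [1, 0, 0, 1, 1, 1])
pcm, bits = split_tfo_octets(muxed)
```

Because only the six MSBs survive, the PCM path loses a little precision while the compressed bits are carried intact, which is the trade-off the multiplexed format accepts.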

[0005] The feature of the above TFO method is that
both the PCM data and the compressed speech data
are transmitted by multiplexing them together, rather
than transmitting the compressed speech data in place of
the PCM data. This enables the speech signal to be
transmitted end to end via a digital one-link connection to
the remote end even when the remote end is a mobile unit.
[0006] In mobile communications, handover occurs
as a mobile terminal moves. As shown in Figure 4,
during communication via a TFO connection established
between TC 22 and TC 24 that support TFO, for
example, if the mobile terminal 28 moves and a handover
occurs from the TC 24 to a TFO non-supporting TC 26, the
TFO has to be interrupted. To provide for such cases,
the TC 22 must also be provided with a means for
allowing a fallback from the TFO to the tandem
connection, that is, a function for encoding PCM data, received
from the TC 26, into compressed speech data so that
switching can be made from the compressed data
pass-through mode to the encoding mode in the event of a
fallback to the tandem connection. Such means is also
needed so that, in the event of an increased error rate
between the TCs, switching can be made at the
receiving TC so as to use PCM data less affected by error.
However, the following problem occurs when effecting
a fallback to the tandem connection.
[0007] In recent codecs, prediction schemes have
become an essential technology for achieving a high
compression ratio: the present signal is predicted from the
past received signal by making use of its statistical
nature, and only the prediction residual is encoded. This
prediction works well, provided that the internal state
variables are matched between the encoder and the
decoder. In fact, when a reset is performed during
encoding and the resulting compressed speech data is
processed by a decoder which is not reset, it can be
confirmed that a signal of maximum amplitude may be
reproduced in certain cases (conversely, resetting only
the decoder will not cause a significant effect on signal
reproduction, since the decoder has the robustness that
allows reproduction from any point in the encoded data).
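
The dependence on matched internal state can be illustrated with a deliberately simple predictive coder. The sketch below is a lossless first-order DPCM, not CELP, and all class and function names are hypothetical; it only shows that resetting the encoder state alone makes the decoder output deviate from the input.

```python
class DpcmEncoder:
    def __init__(self):
        self.prev = 0  # internal state: last reconstructed sample

    def encode(self, sample):
        residual = sample - self.prev  # only the prediction residual is sent
        self.prev = sample             # lossless toy: reconstruction equals input
        return residual

    def reset(self):
        self.prev = 0


class DpcmDecoder:
    def __init__(self):
        self.prev = 0  # must track the encoder's state to reproduce the input

    def decode(self, residual):
        self.prev += residual
        return self.prev


def run(samples, reset_encoder_at=None):
    """Encode and decode a sequence, optionally resetting only the encoder."""
    enc, dec = DpcmEncoder(), DpcmDecoder()
    out = []
    for n, s in enumerate(samples):
        if n == reset_encoder_at:
            enc.reset()
        out.append(dec.decode(enc.encode(s)))
    return out


matched = run([10, 12, 15, 15, 13])
mismatched = run([10, 12, 15, 15, 13], reset_encoder_at=2)
```

In this toy coder the error introduced at the reset never decays, because the predictor has no leakage; practical codecs behave differently in detail, but the sensitivity to a one-sided encoder reset is the same kind of effect.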
[0008] As shown in Figure 5, during the TFO
operation in which the compressed speech data is allowed to
pass through, the encoder of the receiving TC 22 is not
operating, so that its internal state is in a floating state.
When a fallback to the tandem connection occurs, the
encoder of the TC 22 is switched in, and this can cause
a problem such as described above in the decoder
contained in the mobile unit 30.

[0009] One possible method to avoid this problem is
to continue encoding, at the TC 22, the speech decoded
by the right-hand side TC 26 and thereby to prevent the
occurrence of a state mismatch. In another possible
method, the encoder is not kept operating at all times,
but when it is detected by a suitable means that a
tandem fallback should be effected, the encoder starts to
operate (while stopping the transmission of the encoded
data for a certain period of time), before switching is
made to the tandem connection.
[0010] However, these methods require that the
encoding, which involves a large amount of computation,
be performed during the TFO operation and, therefore,
they defeat the purpose of reducing the amount of
processing, which is a feature of TFO. Even if the encoder
is operated only when necessary, this is no different from
operating the encoder at all times when the worst case is
considered, and this also defeats the purpose of
reducing the amount of processing.

SUMMARY OF THE INVENTION

[0011] The present invention has been devised to
solve the above problem in a speech encoder having a
TFO function, and an object of the invention is to provide
a speech encoding device and method that can maintain
internal state matching, while suppressing an increase
in the amount of processing, to provide for the case of
a fallback to the tandem connection.
[0012] According to the present invention, there is 
provided a speech encoding device comprising: means 
for receiving non-compressed speech data and first
compressed speech data which correspond to the non- 
compressed speech data and which are generated 
through compression coding; an encoder for generating 
second compressed speech data from the non-com- 
pressed speech data in a first operation mode; simplified 
encoding means for supplying part of the first com- 
pressed speech data to the encoder and thereby caus- 
ing the encoder to perform simplified encoding in a sec- 
ond operation mode; and a selector for selecting the first 
compressed speech data for output in the second oper- 
ation mode, and for selecting the second compressed 
speech data for output in the first operation mode. 
[0013] Preferably, the encoder generates the com- 
pressed speech data by code excited linear predictive 
coding, and the simplified encoding means supplies sto- 
chastic code data to the encoder as that part of the com- 
pressed speech data. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0014]

Figure 1 is a diagram for explaining a tandem
connection of speech codecs;
Figure 2 is a diagram for explaining TFO;
Figure 3 is a diagram showing the format of data
transmitted between TCs in TFO;
Figure 4 is a diagram for explaining a fallback to the
tandem connection;
Figure 5 is a diagram for explaining a problem
occurring when a fallback to the tandem connection
occurs;
Figure 6 is a block diagram of a speech encoding
device based on CELP;
Figure 7 is a block diagram of a speech encoding
device according to one embodiment of the present
invention;
Figure 8 is a diagram for explaining a time difference
between a codec processing unit frame and
transmitted data;
Figure 9 is a diagram for explaining how time
difference information is extracted;
Figure 10 is a block diagram showing one example
of a configuration for accomplishing the extraction
of the time difference information and the buffering
control performed based on the extracted
information;
Figure 11 is a diagram for explaining how the
amount of delay can be reduced;
Figure 12 is a diagram for explaining the
reconstruction of a stochastic signal; and
Figure 13 is a diagram for explaining an example of
buffering in an ACELP-based codec.

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

[0015] Figure 6 shows the configuration of a speech
encoding device based on CELP (Code Excited Linear
Prediction). As is well known, in a speech encoding
device, such as a CELP device, that uses vector
quantization, an output of a local synthesis part (decoder) 32
and an input speech vector are added in an adder 34 to
compute the error between them, and parameters to be
applied to the local synthesis part 32 are determined such
that the result of the perceptual weighting applied by a
perceptual weighting filter 36 becomes the smallest, the
parameters thus determined being the results of the
encoding. At the decoding side, the same computations as
performed in the local synthesis part 32 are performed
by using the above parameters to reconstruct a speech
signal close to the input speech.

[0016] In the present invention, in the TFO (Tandem
Free Operation) mode also, that is, in the operation
mode in which the compressed speech data,
demultiplexed from the multiplexed signal carrying the PCM
data and the compressed speech data, is passed
unchanged, the encoder keeps on encoding and
compressing the PCM data demultiplexed from the
multiplexed signal. This maintains the internal state of
the encoder close to that of the encoder that produced
the compressed speech data and thus provides for a
fallback to the tandem connection. At the same time, to
alleviate the burden on the encoder, part of the
compressed speech data demultiplexed from the
multiplexed signal is used as part of the parameters
necessary for the local synthesis performed by the local
synthesis part 32 within the encoder.

[0017] The parameters necessary for the local
synthesis include: a filter coefficient for an LPC synthesis
filter 40, which is obtained by a linear prediction analysis
38 of the input speech; the value of pitch to be supplied
to an adaptive codebook 42 which reproduces a voiced
sound; an index value to be supplied to a stochastic
codebook 44 which reproduces an unvoiced sound; and
the gain of the voiced and unvoiced sounds to be
supplied to a gain element 46. Any of these parameters may
be derived from the compressed speech signal
demultiplexed from the multiplexed signal. Here, the output of
the stochastic codebook 44 is a component signal to which
prediction cannot be applied, so that there is no way to
obtain its index value other than to search for it by using
a heuristic algorithm; besides, there is no stored value
for it as a state variable. Deriving this parameter from the
compressed speech signal is therefore the simplest, and
its effectiveness is the greatest of all of the above
parameters. More specifically, when deriving the index value
for the stochastic codebook 44 from the data demultiplexed
from the multiplexed signal, it is only necessary to switch to
that data, and this eliminates the need for searching for
the index value by using the heuristic algorithm in a
distortion minimizing optimum searching unit 48.
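
The saving described in paragraph [0017] can be sketched as follows. This is a toy model under stated assumptions: the codebook is a small list of vectors, the distortion is a plain squared error without the synthesis filter, gain, or perceptual weighting of a real CELP encoder, and the function names are hypothetical.

```python
def squared_error(a, b):
    """Plain squared-error distortion between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def choose_stochastic_index(target, codebook, supplied_index=None):
    """Return (index, number_of_candidates_evaluated)."""
    if supplied_index is not None:
        # TFO mode: reuse the index carried in the received compressed data,
        # so no search is performed at all.
        return supplied_index, 0
    # Tandem mode: exhaustive distortion-minimizing search over the codebook.
    best_index, best_err = 0, float("inf")
    for i, vec in enumerate(codebook):
        err = squared_error(target, vec)
        if err < best_err:
            best_index, best_err = i, err
    return best_index, len(codebook)


codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [1, 1, 0, 0]]
target = [0.9, 1.1, 0.0, 0.0]
searched = choose_stochastic_index(target, codebook)
reused = choose_stochastic_index(target, codebook, supplied_index=3)
```

In the search case every codebook entry is evaluated, whereas in the supplied-index case the evaluation count is zero, which is where the reduction in processing comes from.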
[0018] Figure 7 shows the configuration of one em- 
bodiment of a speech encoding device based on the 
above concept according to the present invention. 
[0019] The input signal to the encoding device is of 
the format shown in Figure 3 and contains the PCM data
decoded at the remote-end TC and the compression- 
encoded data passed unchanged through the remote- 
end TC. A PCM data/compressed data demultiplexing 
unit 50 demultiplexes these two kinds of signals. The 
demultiplexed PCM data is again encoded and com- 
pressed by an encoding functional unit 52 contained in 
the encoding device. In the event of a fallback to the 
tandem connection, the output of the encoding function- 
al unit 52 is selected by a selector 54 for output. 
[0020] On the other hand, during TFO, the demulti- 
plexed compression-encoded data is selected by the 
selector 54 for output; at this time, part of the data, for 
example, the index for the stochastic codebook, is ex- 
tracted by an encoded data selective extraction unit 56. 
The extracted encoded data is selected by a selector 58
and supplied to the encoding functional unit 52. As a
result, during TFO, the encoding functional unit 52 is
spared the necessity of performing part of the process, 
for example, searching for the index value. 
[0021] When a fallback to the tandem connection
occurs, the usual encoding process including a search for
the index value is performed. Here, instead of supplying 
the codebook index to the encoding functional unit 52 
during TFO, stochastic code reconstructed from data 
carrying the feature of the stochastic code may be sup- 
plied as will be described later. 
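
The signal flow of Figure 7, as described in paragraphs [0019] to [0021], can be sketched as below. The encoder is only a stand-in: `ToyEncoder`, `CompressedFrame`, and `process_frame` are hypothetical names, and the "search" is a placeholder computation rather than a real codebook search.

```python
from dataclasses import dataclass


@dataclass
class CompressedFrame:
    stochastic_index: int
    other_params: tuple = ()


class ToyEncoder:
    """Stand-in for encoding functional unit 52; counts full index searches."""

    def __init__(self):
        self.searches = 0
        self.frames_encoded = 0  # proxy for the internal state being kept current

    def encode(self, pcm, stochastic_index=None):
        if stochastic_index is None:
            self.searches += 1               # full encoding: a search is needed
            stochastic_index = sum(pcm) % 8  # placeholder for a real search
        self.frames_encoded += 1
        return CompressedFrame(stochastic_index)


def process_frame(encoder, pcm, received, tfo_active):
    """Return the frame selected for output (the role of selector 54)."""
    if tfo_active and received is not None:
        # Selective extraction (unit 56) feeds the encoder through selector 58,
        # so the encoder state advances without an index search.
        encoder.encode(pcm, stochastic_index=received.stochastic_index)
        return received          # received compressed data passes through
    return encoder.encode(pcm)   # tandem fallback: output locally encoded data


enc = ToyEncoder()
rx = CompressedFrame(5, ("lpc", "pitch"))
tfo_out = process_frame(enc, [1, 2, 3], rx, tfo_active=True)
fallback_out = process_frame(enc, [1, 2, 3], None, tfo_active=False)
```

The point of the design is visible in the counters: the encoder processes every frame in both modes, but the search counter only increases after the fallback.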

[0022] As shown in Figure 8, the phase of the encod- 
ing operation in the encoding functional unit 52 (the 
phase of the processing unit frame 60) does not gener- 
ally match the phase of the PCM data 62 or the com- 
pression-encoded data frame 64 in the multiplexed sig- 
nal. 



[0023] As shown in Figure 9, synchronization patterns
66 are appended to the compressed data embedded in 
the PCM data. Therefore, a FIFO buffer whose length 
is twice the length of the codec processing unit frame is 
provided, as shown in Figure 9, and a compressed data 
frame is extracted by scanning through the data for the 
synchronization patterns. The difference between the 
boundary of the frame thus extracted and the codec 
processing unit frame is extracted as time difference in- 
formation 68 (Figure 8). In Figure 8, the trailing end por- 
tion of the compression-encoded data remaining to be 
transmitted after the end of the processing unit frame 
60 is stored in the buffer for use in the processing of the 
next frame. Likewise, as the PCM data also needs to be 
matched in phase by extracting time difference informa- 
tion 70, the portion corresponding to the time difference 
is stored in the buffer. 

[0024] Figure 10 shows an example of how this is
accomplished: The PCM data and the compressed data
demultiplexed by the PCM data/compressed data de- 
multiplexing unit 50 are stored in buffers 70 and 72, re- 
spectively. A buffering control unit 74 extracts the re- 
spective time information, and controls the storing and 
retrieval operations to the respective buffers 70 and 72. 
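
The scan for synchronization patterns described in paragraphs [0023] and [0024] can be sketched as follows. This is a simplified illustration: it assumes a single contiguous pattern at the start of each compressed frame and a FIFO of twice the frame length held as a list of bits, whereas the actual placement of synchronization bits follows the TFO frame format. The function name is hypothetical.

```python
def find_frame_offset(fifo_bits, sync_pattern, frame_len):
    """Return the offset of the first complete frame in the FIFO, or None.

    The returned offset plays the role of the time difference information,
    expressed in bits relative to the start of the FIFO.
    """
    last_start = len(fifo_bits) - frame_len  # a full frame must fit after the offset
    for offset in range(0, last_start + 1):
        if fifo_bits[offset:offset + len(sync_pattern)] == sync_pattern:
            return offset
    return None


sync = [1, 1, 1, 0]
frame = sync + [0, 1, 0, 0]                # frame_len = 8
fifo = [0, 1, 0] + frame + [1, 1, 1, 0, 0]  # 16 bits = 2 * frame_len
offset = find_frame_offset(fifo, sync, frame_len=8)
extracted = fifo[offset:offset + 8]
```

A payload that happens to contain the pattern would produce a false match in this sketch; a practical implementation would confirm the boundary over several consecutive frames.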
[0025] Since the frame boundary and the codec 
processing unit frame do not generally coincide with 
each other, a processing delay equivalent to one codec 
processing unit frame could result, in the worst case. On 
the other hand, the codec usually has a processing unit 
called the subframe smaller than the processing unit 
frame. When the buffering control is performed using the 
subframe as a unit, the processing delay can be re- 
duced. This will be explained with reference to Figure 
11 by assuming that the processing unit frame length is 
20 ms and the subframe length is 5 ms. 
[0026] In the frame-by-frame buffering control so far
described, the data in the area indicated by A in Figure
11 are held in the respective buffers at time t0, which
indicates the end of one processing unit frame; therefore,
the amount of delay is equal to A. According to TS
28.062, for example, the compressed data frame is also
divided into units of subframes; here, if data arrival is
detected on a subframe-by-subframe basis, not only the
PCM data but the compressed data can also be
matched in phase on a subframe-by-subframe basis,
eliminating the need for matching the phase for the
entire frame, and the amount of delay can thus be reduced.
In Figure 11, as the first subframe data is already
received at time t0, this data is not buffered but is used for
processing. As a result, the amount of delay can be
reduced to B.

[0027] Further, the codec has a delay called the
algorithm delay; this delay is 5 ms, for example, in the case
of AMR, the standard codec in third generation
mobile communications. This is implemented as a
read-ahead buffer in the encoding device, meaning that 5 ms
of read-ahead is possible. That is, in Figure 11, at time
t0 the second subframe of the compressed data has not
arrived yet, but the second subframe of the PCM data
can be processed for encoding; as a result, the amount
of delay can be reduced to C.

[0028] In the case of an ACELP (Algebraic Code
Excited Linear Prediction) codec, which is a class of CELP
codecs, data indicating the positions and signs of the
pulses forming a stochastic signal is transmitted as
stochastic codebook data, as shown in Figure 12. Then, as
shown in Figure 13, the stochastic signal is
reconstructed by a stochastic code reconstructing unit 76, and the
reconstructed data is stored in a buffer 78.
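
The reconstruction of a stochastic signal from pulse positions and signs, as described in paragraph [0028], can be sketched as follows. The sketch assumes the positions have already been decoded into sample indices within one subframe and that each pulse has unit amplitude; the function name is hypothetical, and the decoding of positions from the transmitted codebook bits is codec-specific and not shown.

```python
def reconstruct_stochastic_signal(positions, signs, subframe_len):
    """Build the pulse vector: +1 or -1 at each pulse position, 0 elsewhere."""
    if len(positions) != len(signs):
        raise ValueError("one sign per pulse position is required")
    signal = [0] * subframe_len
    for pos, sign in zip(positions, signs):
        if not 0 <= pos < subframe_len:
            raise ValueError("pulse position outside the subframe")
        signal[pos] += 1 if sign >= 0 else -1  # coincident pulses add up
    return signal


pulses = reconstruct_stochastic_signal([0, 3, 7], [1, -1, 1], subframe_len=8)
```

With 8 kHz sampling, a 5 ms subframe corresponds to 40 samples, so `subframe_len` would be 40 rather than the 8 used here for brevity.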
[0029] As described above, according to the present
invention, the internal state matching of the encoder
when switching from the TFO mode to the tandem
connection can be maintained while suppressing the
corresponding increase in the amount of processing.

Claims 


1. A speech encoding device comprising:

means for receiving non-compressed speech
data and first compressed speech data which
correspond to the non-compressed speech
data and which are generated through
compression coding;

an encoder for generating second compressed
speech data from said non-compressed
speech data in a first operation mode;
simplified encoding means for supplying part of
said first compressed speech data to said
encoder and thereby causing said encoder to
perform simplified encoding in a second operation
mode; and
a selector for selecting said first compressed
speech data for output in said second
operation mode, and for selecting said second
compressed speech data for output in said first
operation mode.

2. A speech encoding device according to claim 1,
wherein said encoder generates said second
compressed speech data by code excited linear
predictive coding, and

said simplified encoding means supplies
stochastic code data to said encoder as said part of
said compressed speech data.

3. A speech encoding device according to claim 1 or
2, wherein said first compressed speech data is
received in the form of a multiplexed signal
multiplexed on said non-compressed speech data, and

said speech encoding device further
comprises means for demultiplexing said non-compressed
speech data and said first compressed speech data
from said multiplexed signal.



4. A speech encoding device according to claim 3, fur- 
ther comprising means for buffering said first com- 
pressed speech data and said non-compressed 
speech data, respectively, and wherein 

time difference information of said first com- 
pressed speech data and said non-compressed 
speech data with respect to a processing phase of 
said encoder is extracted during said demultiplex- 
ing, and 

based on said time difference information, 
said first compressed speech data and said non- 
compressed speech data are retrieved from said 
buffering means. 

5. A speech encoding device according to claim 4, 
wherein reconstructed stochastic code data is buff- 
ered as the part of compressed speech data. 

6. A speech encoding method comprising the steps of: 

receiving non-compressed speech data and
first compressed speech data which corre- 
spond to the non-compressed speech data and 
which are generated through compression cod- 
ing; 

generating in an encoder second compressed 
speech data from said non-compressed 
speech data in a first operation mode; 
supplying part of said first compressed speech 
data to said encoder and thereby causing said 
encoder to perform simplified encoding in a 
second operation mode; and 
selecting said first compressed speech data for 
output in said second operation mode, and se- 
lecting said second compressed speech data 
for output in said first operation mode. 

7. A speech encoding method according to claim 6, 
wherein said encoder generates said second com- 
pressed speech data by code excited linear predic- 
tive coding, and 

in said second operation mode, stochastic 
code data is supplied to said encoder as said part 
of said compressed speech data. 